This paper develops a quasispecies model that incorporates the SOS response. We consider
a unicellular, asexually replicating population of organisms whose genomes consist of a single,
double-stranded DNA molecule, i.e. one chromosome. We assume that repair of post-replication
mismatched base-pairs occurs with probability $\lambda$, and that the SOS response is triggered when
the total number of mismatched base-pairs is at least $l_S$. We further assume that the per-mismatch
SOS elimination rate is characterized by a first-order rate constant $\kappa_{SOS}$. For a single-fitness-peak
landscape where the master genome can sustain up to $l$ mismatches and remain viable, this model
is analytically solvable in the limit of infinite sequence length. The results, which are confirmed by
stochastic simulations, indicate that the SOS response does indeed confer a fitness advantage to a
population, provided that it is only activated when DNA damage is so extensive that a cell will die
if it does not attempt to repair its DNA.

Author Summary: Genetic repair is currently a major
area of experimental research in molecular and systems 
biology, because the breakdown of genetic repair is believed 
to play a crucial role in phenomena such as the emergence 
of cancer and the emergence of antibiotic-resistant strains 
of bacteria. As with many other research areas in biology, 
mathematical models can be expected to play an increas- 
ingly important role in understanding various genetic re- 
pair mechanisms in unicellular organisms. In this vein, I 
have developed an analytically solvable model describing 
the evolutionary dynamics of a unicellular population ca- 
pable of undergoing the SOS response. The SOS response 
is a repair mechanism that has been receiving a consid- 
erable amount of attention recently, primarily because it 
is a repair mechanism that is highly error-prone, and so 
it is somewhat paradoxical that such a repair mechanism 
could confer a selective advantage. To my knowledge, this 
paper is the first of its kind to mathematically model the 
evolutionary aspects of the SOS response, and so I be- 
lieve that this work provides an initial, and much-needed, 
theoretical foundation for understanding the role of this 
repair mechanism. 



I. INTRODUCTION 

Genetic repair is an essential component of cellular 
genomes. Without mechanisms for repairing damaged 
and mutated DNA, genomes could not achieve sufficient 
information content to code for the variety and complexity
of modern terrestrial life [1].

Genetic repair mechanisms fall into two main categories:
those that correct base mis-pairings during the
replication cycle of a cell, and those that repair mutated
and damaged DNA during the growth (G) phase of the
cellular life cycle [1].

*Electronic address: emanuelt@bgu.ac.il

Two important examples of the first class of repair 
mechanisms are DNA proofreading and mismatch repair 
(MMR). DNA proofreading is a repair mechanism that is 
built into the DNA replicases themselves. During daugh- 
ter strand synthesis, an erroneously matched base is ex- 
cised, and a second attempt at a base pairing is made 
[1]. Mismatch repair also removes erroneous bases from
the daughter strand, but does this shortly after daughter
strand synthesis [1].

Two important examples of the second class of repair 
mechanisms are Nucleotide Excision Repair (NER) and 
the SOS response [1]. NER protects a cell from dam- 
age due to radiation, chemical mutagens, and metabolic 
free radicals by removing damaged portions of the DNA 
strand and using the other, presumably undamaged 
strand as a template for re-synthesis of the excised
region [1].

The SOS response is a genomic repair mechanism that 
only activates when there is extensive damage to the cel- 
lular genome. When DNA damage is sufficiently exten- 
sive, the cell stops growing, and the SOS repair pathways 
attempt to restore complementarity to the genome [1].
The SOS response only takes effect when DNA damage 
is so extensive that it may be impossible to use undam- 
aged template strands to correctly re-synthesize damaged 
portions of the genome. Thus, although this means that
the SOS repair mechanism is highly error-prone, it is more
evolutionarily advantageous for the cell to repair the genome
and risk fixing deleterious mutations than it is to leave
the damaged genome unrepaired.

In recent work on evolutionary dynamics, quasispecies models [2]
that consider the first class of repair mechanisms have been studied
[3]. In addition, semiconservative replication,
including semiconservative replication with imperfect lesion
repair (i.e. not all base-pair mismatches are eliminated),
has been considered [9, 10]. Additional
effects, such as multi-gene genomes and multi-chromosome
genomes, have been considered as well.

This paper continues the theme of incorporating various
details characteristic of cellular genomes by developing
a quasispecies model that takes into consideration
the SOS repair mechanism. The model is highly simpli- 
fied, and therefore only a first step in developing proper 
evolutionary dynamics equations with SOS repair. Nev- 
ertheless, because our model is analytically tractable, we 
believe it is a useful and important initial approach to 
mathematically modeling the evolutionary aspects of the 
SOS repair pathway. 



II. MATERIALS AND METHODS
A. Definitions and model set-up 

We consider a unicellular population of asexually replicating
organisms, whose genomes consist of a single DNA
molecule, i.e. one chromosome. The genome may then
be denoted by $\{\sigma, \sigma'\}$, where $\sigma$, $\sigma'$ denote the two strands
of the DNA molecule. If the genome is of length $L$,
then we may write $\sigma = b_1 \dots b_L$ and $\sigma' = b'_1 \dots b'_L$, where
each base $b_i$, $b'_i$ is chosen from an alphabet of size
$S$ (usually $S = 4$). If $\bar{b}_i$ denotes the base complementary
to $b_i$ (for the standard Watson-Crick bases, the
pairings are Adenine (A) - Thymine (T), Guanine (G) -
Cytosine (C)), and $\bar{\sigma}$ denotes the strand complementary
to $\sigma$, then $\bar{\sigma} = \bar{b}_L \dots \bar{b}_1$. This follows from the antiparallel
nature of double-stranded DNA [1].

We let $n_{\{\sigma,\sigma'\}}$ denote the number of organisms with
genome $\{\sigma, \sigma'\}$, and we assume that replication occurs
with a genome-dependent, first-order rate constant, denoted
$\kappa_{\{\sigma,\sigma'\}}$. The set of all $\kappa_{\{\sigma,\sigma'\}}$ defines the fitness
landscape.

The semiconservative replication of the DNA genomes 
happens in three stages: 

1. Strand separation, whereby each strand of the chro- 
mosome separates to act as a template for daughter 
strand synthesis. 

2. Daughter strand synthesis. We assume a genome
and base-independent mismatch probability $\epsilon$. This
error probability $\epsilon$ includes all error correction
mechanisms, such as proofreading and mismatch
repair, that are active during the replication phase
of the cell.

3. Lesion repair, where any post-replication mismatches
are removed. Here, there is no longer
the parent-daughter strand discrimination that was
available during daughter strand synthesis, so in
contrast to DNA proofreading and mismatch repair,
lesion repair has a 50% chance of removing
the mutation, and a 50% chance of communicating
it to the parent strand and fixing the mutation
in the genome. We also do not assume that lesion
repair is perfectly efficient, so that we consider a
genome and base-independent probability $\lambda$ of removing
a mismatch. We call $\lambda$ the lesion repair
efficiency.

FIG. 1: Illustration of the SOS repair mechanism being considered
in this paper. A DNA genome with two base-pair
mismatches is restored to a fully complementary genome in
two repair steps, where during each step a single mismatch
(i.e. lesion) is eliminated. The first lesion is repaired correctly,
so that the original base-pair of the master genome
strands (solid blue lines) is restored, while the second lesion
is repaired incorrectly, so that a mutation (dotted red lines)
becomes fixed in the genome.

In our simplified model, the SOS response is triggered
if a given genome has at least $l_S$ mismatches. The replication
rate of all cells undergoing SOS repair is zero. We
assume that removal of mismatches is catalyzed by an
enzyme that binds to a mismatch and then eliminates
the mismatch at a rate characterized by a first-order rate
constant $\kappa_{SOS}$. Therefore, the probability that a given
mismatch is eliminated over an infinitesimal time interval
$dt$ is given by $\kappa_{SOS}\,dt$.

In this paper, we will consider the behavior of the
model in the limit of infinite sequence length. If $\mu = \epsilon L$ is
held constant as $L \to \infty$, then the probability of an error-free
daughter strand synthesis is given by $(1 - \epsilon)^L \to e^{-\mu}$.
Therefore, fixing $\mu$ in the infinite sequence length limit is
equivalent to fixing the per-genome replication fidelity.
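
The convergence of the error-free synthesis probability to $e^{-\mu}$ can be checked numerically. The short Python sketch below is only an illustration; the function name `error_free_probability` and the sample value of $\mu$ are our own choices, not part of the original model.

```python
import math

def error_free_probability(mu: float, L: int) -> float:
    """Probability that all L bases are copied correctly when the
    per-base mismatch probability is eps = mu / L."""
    eps = mu / L
    return (1.0 - eps) ** L

if __name__ == "__main__":
    mu = 1.5  # illustrative value of mu = eps * L
    for L in (10, 100, 10_000, 1_000_000):
        # The second column approaches exp(-mu) as L grows
        print(L, error_free_probability(mu, L), math.exp(-mu))
```

For fixed $\mu$, the finite-$L$ value approaches $e^{-\mu}$ from below, with a relative error of order $\mu^2/(2L)$.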

Finally, we assume that the fitness landscape is defined
by a master genome $\{\sigma_0, \bar{\sigma}_0\}$. Specifically, we define a
genome $\{\sigma, \sigma'\}$ to be viable, with a first-order growth
rate constant $k > 1$, if it has at most $l$ mismatches,
and if it does not differ from $\{\sigma_0, \bar{\sigma}_0\}$ by any fixed mutations.
Otherwise, the genome is unviable, with a first-order
growth rate constant of 1.

B. Symmetrized population distribution 

We can develop the infinite sequence length equations
for our model, assuming an initially prepared clonal population
consisting entirely of the genome $\{\sigma_0, \bar{\sigma}_0\}$. Because,
during replication, only a finite number of mutations
are possible, at any time the population will consist
of a distribution of genomes $\{\sigma, \sigma'\}$ where $\sigma$, $\sigma'$ differ
from either $\sigma_0$ or $\bar{\sigma}_0$ in at most a finite number
of spots. Thus, given two gene sequences $\sigma_1$, $\sigma_2$, if we
let $D_H(\sigma_1, \sigma_2)$ denote the Hamming distance between $\sigma_1$
and $\sigma_2$ (i.e. the number of sites where $\sigma_1$ and $\sigma_2$ differ),
then either $D_H(\sigma, \sigma_0)$ and $D_H(\sigma', \bar{\sigma}_0)$ are finite, or
$D_H(\sigma, \bar{\sigma}_0)$ and $D_H(\sigma', \sigma_0)$ are finite.

As a result, we can define a strand ordering $(\sigma, \sigma')$ for
a genome $\{\sigma, \sigma'\}$, where it is understood that $\sigma$ is a finite
Hamming distance from $\sigma_0$ and $\sigma'$ is a finite Hamming
distance from $\bar{\sigma}_0$.

A given genome $(\sigma, \sigma')$ may then be characterized by
four parameters $l_C$, $l_L$, $l_R$, and $l_B$. We let $l_C$ denote the
number of sites where $\sigma$ and $\sigma'$ are both complementary,
yet differ from the corresponding bases in $\sigma_0$ and $\bar{\sigma}_0$. We
let $l_L$ denote the number of sites where $\sigma$ differs from $\sigma_0$,
but $\sigma'$ is identical to $\bar{\sigma}_0$. We let $l_R$ denote the number
of sites where $\sigma$ is identical to $\sigma_0$, but $\sigma'$ differs from $\bar{\sigma}_0$.
Finally, we let $l_B$ denote the number of sites where $\sigma$ and
$\sigma'$ differ from $\sigma_0$ and $\bar{\sigma}_0$, but are not complementary (for
an illustration of these parameters, see [10]).

Note that the fitness landscape depends only on $l_C$, $l_L$,
$l_R$, and $l_B$, and hence the fitness of a given organism may
be denoted by $\kappa_{(l_C,l_L,l_R,l_B)}$, where for our single-fitness-peak
landscape we have $\kappa_{(l_C,l_L,l_R,l_B)} = k$ if $l_C = 0$ and
$l_L + l_R + l_B \le l$, and 1 otherwise.
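
As a concrete reading of this landscape, the following Python sketch evaluates the fitness from the four parameters. The function name `fitness` and its argument names are illustrative, and the viability condition assumes that a genome with no fixed mutations tolerates up to and including $l$ mismatches.

```python
def fitness(l_C: int, l_L: int, l_R: int, l_B: int, k: float, l: int) -> float:
    """Single-fitness-peak landscape: growth rate k for viable genomes, 1 otherwise.

    Assumption (ours): 'viable' means no fixed mutations (l_C == 0) and
    at most l mismatches in total.
    """
    mismatches = l_L + l_R + l_B
    if l_C == 0 and mismatches <= l:
        return k
    return 1.0

if __name__ == "__main__":
    print(fitness(0, 1, 2, 0, k=9.0, l=4))  # viable: 3 mismatches, none fixed
    print(fitness(1, 0, 0, 0, k=9.0, l=4))  # unviable: one fixed mutation
```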

By the symmetry of the fitness landscape, and by the
symmetry of the initial population distribution, we can
group all genomes of identical $l_C$, $l_L$, $l_R$, and $l_B$, and
derive the dynamical equations of the symmetrized population
distribution. We therefore let $n_{(l_C,l_L,l_R,l_B)}$ denote
the total number of organisms in the population whose
genomes are characterized by the parameters $l_C$, $l_L$, $l_R$,
and $l_B$, and we let $n^{(SOS)}_{(l_C,l_L,l_R,l_B)}$ denote the total number
of organisms in the population undergoing the SOS response,
whose genomes are similarly characterized by the
parameters $l_C$, $l_L$, $l_R$, and $l_B$. The corresponding population
fractions are denoted $z_{(l_C,l_L,l_R,l_B)}$ and $z^{(SOS)}_{(l_C,l_L,l_R,l_B)}$,
respectively.

C. Dynamical equations 



To develop the dynamical equations for both the
$z_{(l_C,l_L,l_R,l_B)}$ and the $z^{(SOS)}_{(l_C,l_L,l_R,l_B)}$ quantities, we begin by
considering a genome $(\sigma, \sigma')$, characterized by the parameters
$l_C$, $l_L$, $l_R$, and $l_B$.

We first consider the case where this genome is not undergoing
the SOS response. Then, due to the semiconservative
nature of DNA replication, this genome is being
destroyed at a rate given by $-\kappa_{(l_C,l_L,l_R,l_B)} n_{(l_C,l_L,l_R,l_B)}$.
This genome, however, is produced by other genomes in
the population, as a result of replication. So, consider
some other genome $(\sigma'', \sigma''')$ which produces $(\sigma, \sigma')$ upon
replication. This can either occur via the $\sigma''$ template
strand, the $\sigma'''$ template strand, or both.

If the $(\sigma'', \sigma''')$ genome is characterized by the parameters
$l''_C$, $l''_L$, $l''_R$, and $l''_B$, then $\sigma''$ differs from $\sigma_0$ in
$l''_C + l''_L + l''_B$ spots. Because sequence lengths are infinite,
the probability of a mismatch in one of these spots
during daughter strand synthesis is 0. In the remaining
sites, let $l''_1$ denote the number of mismatches that are
not corrected, and $l''_2$ denote the number of mismatches
that are repaired, but fixed as a mutation in the genome.
Then the resulting genome $(\sigma, \sigma')$ is characterized by:

1. $l_C = l''_C + l''_L + l''_B + l''_2$

2. $l_L = 0$

3. $l_R = l''_1$

4. $l_B = 0$

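The probability of a given set of mutations corresponding
to $l''_1$, $l''_2$ is
$\epsilon^{l''_1 + l''_2} (1 - \lambda)^{l''_1} (\lambda/2)^{l''_2} (1 - \epsilon + \epsilon\lambda/2)^{L - l''_C - l''_L - l''_B - l''_1 - l''_2}$.
The term $(1 - \epsilon + \epsilon\lambda/2)^{L - l''_C - l''_L - l''_B - l''_1 - l''_2}$ arises as the probability that the
remaining $L - l''_C - l''_L - l''_B - l''_1 - l''_2$ sites on $\sigma''$ remain
identical to $\sigma_0$, and the corresponding daughter strand
sites are identical to $\bar{\sigma}_0$. The per-site probability of this
is the probability of error-free daughter strand synthesis,
$1 - \epsilon$, plus the probability of a mismatch, $\epsilon$, times $\lambda$,
the probability that complementarity is restored, times
$1/2$, the probability that complementarity is restored correctly.

The degeneracy is given by
$(L - l''_C - l''_L - l''_B)! / (l''_1!\, l''_2!\, (L - l''_C - l''_L - l''_B - l''_1 - l''_2)!)$,
so in the limit of infinite sequence length the total probability becomes,

$$
\frac{(L - l''_C - l''_L - l''_B)!}{l''_1!\, l''_2!\, (L - l''_C - l''_L - l''_B - l''_1 - l''_2)!}\,
\epsilon^{l''_1 + l''_2} (1 - \lambda)^{l''_1} \left(\frac{\lambda}{2}\right)^{l''_2}
\left(1 - (1 - \lambda/2)\epsilon\right)^{L - l''_C - l''_L - l''_B - l''_1 - l''_2}
\to \frac{1}{l''_1!\, l''_2!} [\mu(1 - \lambda)]^{l''_1} \left(\frac{\mu\lambda}{2}\right)^{l''_2} e^{-(1 - \lambda/2)\mu}
\tag{1}
$$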
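
The infinite-length limit of this transition probability can be checked numerically. The Python sketch below compares the finite-$L$ expression with its limit; the function names, the argument `occupied` (standing for $l''_C + l''_L + l''_B$), and the sample parameter values are illustrative assumptions rather than part of the original derivation.

```python
import math

def finite_L_probability(l1, l2, mu, lam, L, occupied=0):
    """Finite-L probability of l1 uncorrected mismatches and l2 mismatches
    that are repaired but fixed as mutations, with eps = mu / L.
    'occupied' is the number of template sites already differing from the master."""
    eps = mu / L
    free = L - occupied
    rest = free - l1 - l2
    degeneracy = math.comb(free, l1) * math.comb(free - l1, l2)
    return (degeneracy * eps ** (l1 + l2) * (1 - lam) ** l1 * (lam / 2) ** l2
            * (1 - (1 - lam / 2) * eps) ** rest)

def limit_probability(l1, l2, mu, lam):
    """Infinite-sequence-length limit of the same probability."""
    return ((mu * (1 - lam)) ** l1 * (mu * lam / 2) ** l2
            * math.exp(-(1 - lam / 2) * mu)
            / (math.factorial(l1) * math.factorial(l2)))

if __name__ == "__main__":
    for L in (100, 10_000, 1_000_000):
        print(L, finite_L_probability(2, 1, 1.0, 0.3, L), limit_probability(2, 1, 1.0, 0.3))
```

The degeneracy factor is written as a product of two binomial coefficients, which equals the multinomial coefficient in the text while avoiding very large factorials.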

If $(\sigma, \sigma')$ is generated by $\sigma'''$, then we have,

1. $l_C = l''_C + l''_R + l''_B + l''_2$

2. $l_L = l''_1$

3. $l_R = 0$

4. $l_B = 0$

We also obtain an overall transition probability of
$\frac{1}{l''_1!\, l''_2!} [\mu(1 - \lambda)]^{l''_1} (\mu\lambda/2)^{l''_2} e^{-(1 - \lambda/2)\mu}$.

It is important to note from the $\sigma''$ and $\sigma'''$ results
that genomes with $l_B > 0$ cannot be generated during
replication. Since SOS repair eliminates mismatches, it
follows that a population where $l_B$ is initially 0 for all
genomes will always have a population where $l_B = 0$.
Therefore, we may assume in subsequent derivations that
$l_B$, $l''_B$ are 0.

Furthermore, note that strands $\sigma''$ that are a finite
Hamming distance away from $\sigma_0$ can only generate
daughter genomes where $l_L = 0$, while strands $\sigma'''$ that
are a finite Hamming distance away from $\bar{\sigma}_0$ can only
generate daughter genomes where $l_R = 0$. Therefore, we
may also assume in subsequent derivations that $l_L$, $l_R$
are not simultaneously $> 0$.

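Then for the genomes $(\sigma, \sigma')$ generated by $\sigma''$, we have
$l_C = l''_C + l''_L + l''_2$, and $l_R = l''_1$. Therefore, the restriction
on $(\sigma'', \sigma''')$ is that $0 \le l''_2 \le l_C$, $0 \le l''_L \le l_C - l''_2$, and
$l''_C = l_C - l''_L - l''_2$. Note that there is no restriction on $l''_R$.

Then for the population number $n_{(l_C,0,l_R,0)}$ we have a
contribution from the $\sigma''$ strands of

$$
\frac{1}{l_R!} [\mu(1 - \lambda)]^{l_R} e^{-\mu(1 - \lambda/2)}
\sum_{l''_2 = 0}^{l_C} \sum_{l''_L = 0}^{l_C - l''_2} \sum_{l''_R = 0}^{\infty}
\frac{1}{l''_2!} \left(\frac{\mu\lambda}{2}\right)^{l''_2}
\kappa_{(l_C - l''_L - l''_2,\, l''_L,\, l''_R,\, 0)}\,
n_{(l_C - l''_L - l''_2,\, l''_L,\, l''_R,\, 0)}
\tag{2}
$$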

A similar expression is obtained for the population
number $n_{(l_C,l_L,0,0)}$, except $l_R$ is replaced with $l_L$, and
the roles of $l''_L$ and $l''_R$ are exchanged.

It should also be noted that, by the symmetry of the fitness
landscape, we have that $n_{(l_C,l_L,l_R,l_B)} = n_{(l_C,l_R,l_L,l_B)}$.
Another way to note this is that, for a given genome
$(\sigma, \sigma')$, if we change the ordering of the strands so that
the first strand is of finite Hamming distance to $\bar{\sigma}_0$,
and the second strand is of finite Hamming distance
to $\sigma_0$, then the genome $\{\sigma, \sigma'\}$ must be represented
as $(\sigma', \sigma)$, and is characterized by the parameters $l_C$,
$l_R$, $l_L$, and $l_B$. If $\tilde{n}_{(l_C,l_L,l_R,l_B)}$ denotes the number of
genomes characterized by $l_C$, $l_L$, $l_R$, and $l_B$, with respect
to the $(\bar{\sigma}_0, \sigma_0)$ strand ordering, then since there
is a one-to-one correspondence between genomes $(\sigma, \sigma')$
with parameters $l_C$, $l_L$, $l_R$, $l_B$ with respect to the first
ordering, and genomes $(\sigma', \sigma)$ with parameters $l_C$, $l_R$,
$l_L$, $l_B$ with respect to the second ordering, it follows
that $n_{(l_C,l_L,l_R,l_B)} = \tilde{n}_{(l_C,l_R,l_L,l_B)}$. However, since the
fitness landscape is invariant under strand ordering, we
have $\tilde{n}_{(l_C,l_L,l_R,l_B)} = n_{(l_C,l_L,l_R,l_B)}$, so that
$n_{(l_C,l_L,l_R,l_B)} = n_{(l_C,l_R,l_L,l_B)}$.

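Taking into consideration the contribution to
$n_{(l_C,0,0,0)}$, we may put everything together and obtain,
after changing variables from population numbers to
population fractions,

$$
\begin{aligned}
\frac{dz_{(l_C,0,0,0)}}{dt} &= -(\kappa_{(l_C,0,0,0)} + \bar{\kappa}(t))\, z_{(l_C,0,0,0)}
+ \kappa_{SOS} \left( z^{(SOS)}_{(l_C,0,1,0)} + (1 - \delta_{l_C 0})\, z^{(SOS)}_{(l_C-1,0,1,0)} \right) \\
&\quad + 2 e^{-\mu(1 - \lambda/2)} \sum_{l_{1,C}=0}^{l_C} \sum_{l_1=0}^{l_C - l_{1,C}} \sum_{l_2=0}^{\infty}
\frac{1}{l_{1,C}!} \left(\frac{\mu\lambda}{2}\right)^{l_{1,C}}
\kappa_{(l_C - l_{1,C} - l_1,\, l_1,\, l_2,\, 0)}\, z_{(l_C - l_{1,C} - l_1,\, l_1,\, l_2,\, 0)} \\[6pt]
\frac{dz_{(l_C,0,l',0)}}{dt} &= -(\kappa_{(l_C,0,l',0)} + \bar{\kappa}(t))\, z_{(l_C,0,l',0)} \\
&\quad + \frac{1}{l'!} [\mu(1 - \lambda)]^{l'} e^{-\mu(1 - \lambda/2)}
\sum_{l_{1,C}=0}^{l_C} \sum_{l_1=0}^{l_C - l_{1,C}} \sum_{l_2=0}^{\infty}
\frac{1}{l_{1,C}!} \left(\frac{\mu\lambda}{2}\right)^{l_{1,C}}
\kappa_{(l_C - l_{1,C} - l_1,\, l_1,\, l_2,\, 0)}\, z_{(l_C - l_{1,C} - l_1,\, l_1,\, l_2,\, 0)} \\
&\qquad \text{for } l' = 1, \dots, l_S - 1 \\[6pt]
\frac{dz^{(SOS)}_{(l_C,0,l',0)}}{dt} &= \kappa_{SOS} \left[ \frac{l' + 1}{2}
\left( z^{(SOS)}_{(l_C,0,l'+1,0)} + (1 - \delta_{l_C 0})\, z^{(SOS)}_{(l_C-1,0,l'+1,0)} \right)
- l'\, z^{(SOS)}_{(l_C,0,l',0)} \right] - \bar{\kappa}(t)\, z^{(SOS)}_{(l_C,0,l',0)} \\
&\qquad \text{for } l' = 1, \dots, l_S - 1 \\[6pt]
\frac{dz^{(SOS)}_{(l_C,0,l',0)}}{dt} &= \kappa_{SOS} \left[ \frac{l' + 1}{2}
\left( z^{(SOS)}_{(l_C,0,l'+1,0)} + (1 - \delta_{l_C 0})\, z^{(SOS)}_{(l_C-1,0,l'+1,0)} \right)
- l'\, z^{(SOS)}_{(l_C,0,l',0)} \right] - \bar{\kappa}(t)\, z^{(SOS)}_{(l_C,0,l',0)} \\
&\quad + \frac{1}{l'!} [\mu(1 - \lambda)]^{l'} e^{-\mu(1 - \lambda/2)}
\sum_{l_{1,C}=0}^{l_C} \sum_{l_1=0}^{l_C - l_{1,C}} \sum_{l_2=0}^{\infty}
\frac{1}{l_{1,C}!} \left(\frac{\mu\lambda}{2}\right)^{l_{1,C}}
\kappa_{(l_C - l_{1,C} - l_1,\, l_1,\, l_2,\, 0)}\, z_{(l_C - l_{1,C} - l_1,\, l_1,\, l_2,\, 0)} \\
&\qquad \text{for } l' \ge l_S
\end{aligned}
\tag{3}
$$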



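where $\bar{\kappa}(t) = \sum_{l_C=0}^{\infty} \sum_{l_L=0}^{\infty} \sum_{l_R=0}^{\infty} \kappa_{(l_C,l_L,l_R,0)}\, z_{(l_C,l_L,l_R,0)}
= \sum_{l_C=0}^{\infty} \left( \kappa_{(l_C,0,0,0)}\, z_{(l_C,0,0,0)} + 2 \sum_{l'=1}^{\infty} \kappa_{(l_C,0,l',0)}\, z_{(l_C,0,l',0)} \right)$
is the mean fitness of the population.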



Note that we do not write down the dynamical equations
for $z_{(l_C,l',0,0)}$ or $z^{(SOS)}_{(l_C,l',0,0)}$, since they are redundant.
The factor of 1/2 appearing in the SOS terms arises
from the fact that when a mismatch is removed, it either
corrects the daughter strand synthesis error, or it fixes
the mismatch as a mutation in the genome. In the former
case, the value of $l_C$ remains unchanged, while in the
latter case it is incremented by 1.

It should be noted that this factor is missing in
the contribution to $z_{(l_C,0,0,0)}$ from SOS repair. The
reason for this is that this contribution comes from
$z^{(SOS)}_{(l_C,0,1,0)}$, $z^{(SOS)}_{(l_C,1,0,0)}$, $z^{(SOS)}_{(l_C-1,0,1,0)}$, and $z^{(SOS)}_{(l_C-1,1,0,0)}$.
However, because $z^{(SOS)}_{(l_C,0,1,0)} = z^{(SOS)}_{(l_C,1,0,0)}$ and
$z^{(SOS)}_{(l_C-1,0,1,0)} = z^{(SOS)}_{(l_C-1,1,0,0)}$, we may combine identical terms and
eliminate the factor of 1/2.

The factors of $l' + 1$ and $l'$ in front of the $\kappa_{SOS}$ rate constant
arise from the fact that the fraction of genomes
whose SOS enzymes are bound to a mismatch is proportional
to the total number of mismatches, hence the
resulting SOS rate constant is proportional to the total
number of mismatches.



III. RESULTS AND DISCUSSION 
A. Steady-state behavior 

1. Definitions and basic equations 

To obtain the steady-state behavior of our model, we 
begin by introducing some definitions that will allow us 
to simplify the calculations. 

1. $z_1 = z_{(0,0,0,0)}$.

2. $z_2 = \sum_{l'=1}^{l} z_{(0,0,l',0)}$.

3. $z_3 = \sum_{l'=l+1}^{l_S-1} z_{(0,0,l',0)}$.

4. $z_4 = \sum_{l_C=1}^{\infty} z_{(l_C,0,0,0)}$.

5. $z_5 = \sum_{l_C=1}^{\infty} \sum_{l'=1}^{l} z_{(l_C,0,l',0)}$.

6. $z_6 = \sum_{l_C=1}^{\infty} \sum_{l'=l+1}^{l_S-1} z_{(l_C,0,l',0)}$.

7. $z^{(SOS)}_{0l'} = z^{(SOS)}_{(0,0,l',0)}$.

8. $z^{(SOS)}_{1l'} = \sum_{l_C=0}^{\infty} z^{(SOS)}_{(l_C,0,l',0)}$.

9. $z^{(SOS)}_{0} = \sum_{l'=1}^{\infty} z^{(SOS)}_{0l'}$.

10. $z^{(SOS)} = \sum_{l'=1}^{\infty} z^{(SOS)}_{1l'}$.

where we set $l = l_S - 1$ whenever $l$ was previously defined
as $\ge l_S$. The differential equations for $z_1$, $z_2$, $z_3$, $z_4$, $z_5$,
and $z_6$ are readily derived. From the equations,

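$$
\sum_{l_2=0}^{\infty} \kappa_{(0,0,l_2,0)}\, z_{(0,0,l_2,0)} = k z_1 + k z_2 + z_3
\tag{4}
$$

$$
\sum_{l_C=0}^{\infty} \sum_{l_{1,C}=0}^{l_C} \sum_{l_1=0}^{l_C - l_{1,C}} \sum_{l_2=0}^{\infty}
\frac{1}{l_{1,C}!} \left(\frac{\mu\lambda}{2}\right)^{l_{1,C}}
\kappa_{(l_C - l_{1,C} - l_1,\, l_1,\, l_2,\, 0)}\, z_{(l_C - l_{1,C} - l_1,\, l_1,\, l_2,\, 0)}
= e^{\mu\lambda/2} \left[ k z_1 + 2k z_2 + 2 z_3 + z_4 + 2 z_5 + 2 z_6 \right]
\tag{5}
$$

we obtain,

$$
\begin{aligned}
\frac{dz_1}{dt} &= -(k + \bar{\kappa}(t))\, z_1 + 2 e^{-\mu(1 - \lambda/2)} [k z_1 + k z_2 + z_3] + \kappa_{SOS}\, z^{(SOS)}_{01} \\
\frac{dz_2}{dt} &= -(k + \bar{\kappa}(t))\, z_2 + (f_l(\mu, \lambda) - 1)\, e^{-\mu(1 - \lambda/2)} [k z_1 + k z_2 + z_3] \\
\frac{dz_3}{dt} &= -(1 + \bar{\kappa}(t))\, z_3 + (f_{l_S-1}(\mu, \lambda) - f_l(\mu, \lambda))\, e^{-\mu(1 - \lambda/2)} [k z_1 + k z_2 + z_3] \\
\frac{dz_4}{dt} &= -(1 + \bar{\kappa}(t))\, z_4 + 2 e^{-\mu(1 - \lambda/2)} \left[ e^{\mu\lambda/2} (k z_1 + 2k z_2 + 2 z_3 + z_4 + 2 z_5 + 2 z_6) - (k z_1 + k z_2 + z_3) \right] \\
&\quad + \kappa_{SOS} \left[ 2 z^{(SOS)}_{11} - z^{(SOS)}_{01} \right] \\
\frac{dz_5}{dt} &= -(1 + \bar{\kappa}(t))\, z_5 + (f_l(\mu, \lambda) - 1)\, e^{-\mu(1 - \lambda/2)} \\
&\quad \times \left[ e^{\mu\lambda/2} (k z_1 + 2k z_2 + 2 z_3 + z_4 + 2 z_5 + 2 z_6) - (k z_1 + k z_2 + z_3) \right] \\
\frac{dz_6}{dt} &= -(1 + \bar{\kappa}(t))\, z_6 + (f_{l_S-1}(\mu, \lambda) - f_l(\mu, \lambda))\, e^{-\mu(1 - \lambda/2)} \\
&\quad \times \left[ e^{\mu\lambda/2} (k z_1 + 2k z_2 + 2 z_3 + z_4 + 2 z_5 + 2 z_6) - (k z_1 + k z_2 + z_3) \right]
\end{aligned}
\tag{6}
$$

where we define $f_l(\mu, \lambda) \equiv \sum_{k=0}^{l} [\mu(1 - \lambda)]^k / k!$ [10].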
We also have, 



(6) 



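$$
\begin{aligned}
\frac{dz^{(SOS)}_{0l'}}{dt} &= \kappa_{SOS}\, \frac{l' + 1}{2}\, z^{(SOS)}_{0\,l'+1} - (l' \kappa_{SOS} + \bar{\kappa}(t))\, z^{(SOS)}_{0l'}
\qquad \text{for } l' = 1, \dots, l_S - 1 \\
\frac{dz^{(SOS)}_{0l'}}{dt} &= \kappa_{SOS}\, \frac{l' + 1}{2}\, z^{(SOS)}_{0\,l'+1} - (l' \kappa_{SOS} + \bar{\kappa}(t))\, z^{(SOS)}_{0l'}
+ \frac{1}{l'!} [\mu(1 - \lambda)]^{l'} e^{-\mu(1 - \lambda/2)} [k z_1 + k z_2 + z_3]
\qquad \text{for } l' \ge l_S \\
\frac{dz^{(SOS)}_{1l'}}{dt} &= \kappa_{SOS}\, (l' + 1)\, z^{(SOS)}_{1\,l'+1} - (l' \kappa_{SOS} + \bar{\kappa}(t))\, z^{(SOS)}_{1l'}
\qquad \text{for } l' = 1, \dots, l_S - 1 \\
\frac{dz^{(SOS)}_{1l'}}{dt} &= \kappa_{SOS}\, (l' + 1)\, z^{(SOS)}_{1\,l'+1} - (l' \kappa_{SOS} + \bar{\kappa}(t))\, z^{(SOS)}_{1l'} \\
&\quad + \frac{1}{l'!} [\mu(1 - \lambda)]^{l'} e^{-\mu(1 - \lambda)} [k z_1 + 2k z_2 + 2 z_3 + z_4 + 2 z_5 + 2 z_6]
\qquad \text{for } l' \ge l_S
\end{aligned}
\tag{7}
$$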
We can add these equations to obtain,

$$
\frac{dz^{(SOS)}}{dt} = -\kappa_{SOS}\, z^{(SOS)}_{11} - \bar{\kappa}(t)\, z^{(SOS)}
+ \left( 1 - e^{-\mu(1 - \lambda)} f_{l_S-1}(\mu, \lambda) \right)
[k z_1 + 2k z_2 + 2 z_3 + z_4 + 2 z_5 + 2 z_6]
\tag{8}
$$

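For the purposes of computing the mean fitness at
steady-state, we can simplify the system of equations
somewhat by defining $\tilde{z}_4 = z_4 + 2 z_5 + 2 z_6$. We obtain,

$$
\begin{aligned}
\frac{d\tilde{z}_4}{dt} &= -(1 + \bar{\kappa}(t))\, \tilde{z}_4 + 2 e^{-\mu(1 - \lambda/2)} f_{l_S-1}(\mu, \lambda) \\
&\quad \times \left[ e^{\mu\lambda/2} (k z_1 + 2k z_2 + 2 z_3 + \tilde{z}_4) - (k z_1 + k z_2 + z_3) \right]
+ \kappa_{SOS} \left[ 2 z^{(SOS)}_{11} - z^{(SOS)}_{01} \right]
\end{aligned}
\tag{9}
$$

For consistency of notation, in what follows we shall simply
let $z_4$ denote $\tilde{z}_4$.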

2. Determining $z^{(SOS)}_{01}$, $z^{(SOS)}_{11}$, and $z^{(SOS)}$

To obtain the steady-state behavior of this system of 
equations, we begin by first solving for the steady-state 
of the population undergoing SOS repair. 

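For $l' = 1, \dots, l_S - 1$ we have at steady-state that,

$$
z^{(SOS)}_{0\,l'+1} = \frac{2}{l' + 1} \left( l' + \frac{\bar{\kappa}(t = \infty)}{\kappa_{SOS}} \right) z^{(SOS)}_{0l'}
\tag{10}
$$

which gives,

$$
z^{(SOS)}_{0 l_S} = \frac{2^{l_S - 1}}{l_S!} \prod_{l''=1}^{l_S - 1} \left( l'' + \frac{\bar{\kappa}(t = \infty)}{\kappa_{SOS}} \right) z^{(SOS)}_{01}
\tag{11}
$$

For $l' \ge l_S$, we have,

$$
z^{(SOS)}_{0\,l'+1} = \frac{2}{l' + 1} \left( l' + \frac{\bar{\kappa}(t = \infty)}{\kappa_{SOS}} \right) z^{(SOS)}_{0l'}
- \frac{2}{\kappa_{SOS}} \frac{1}{(l' + 1)!} [\mu(1 - \lambda)]^{l'} e^{-\mu(1 - \lambda/2)} [k z_1 + k z_2 + z_3]
\tag{12}
$$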

This expression has the form of the recursion relation,
$x_{n+1} = a_n x_n - b_n$. Using mathematical induction, it is
possible to prove that
$x_n = a_{n-1} \times \cdots \times a_0 x_0 - a_{n-1} \times \cdots \times a_1 b_0 - a_{n-1} \times \cdots \times a_2 b_1 - \cdots - a_{n-1} b_{n-2} - b_{n-1}$.
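
The closed form of this recursion is easy to check numerically. In the Python sketch below, the function names `iterate` and `closed_form` and the sample coefficients are illustrative; the code simply compares direct iteration of $x_{n+1} = a_n x_n - b_n$ with the unrolled expression.

```python
from math import prod

def iterate(a, b, x0):
    """Apply x_{n+1} = a_n * x_n - b_n for n = 0, ..., len(a) - 1."""
    x = x0
    for a_n, b_n in zip(a, b):
        x = a_n * x - b_n
    return x

def closed_form(a, b, x0):
    """x_n = a_{n-1}...a_0 x_0 - sum over m of (a_{n-1}...a_{m+1}) b_m,
    where an empty product of a's equals 1."""
    n = len(a)
    total = prod(a) * x0
    for m in range(n):
        total -= prod(a[m + 1:]) * b[m]
    return total

if __name__ == "__main__":
    a = [2.0, 3.0, 0.5]
    b = [1.0, 4.0, 2.0]
    print(iterate(a, b, 1.5), closed_form(a, b, 1.5))
```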

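Therefore, for $l' \ge l_S$,

$$
\begin{aligned}
z^{(SOS)}_{0l'} &= \frac{2^{l' - 1}}{l'!} \prod_{l''=1}^{l' - 1} \left( l'' + \frac{\bar{\kappa}(t = \infty)}{\kappa_{SOS}} \right) \\
&\quad \times \left[ z^{(SOS)}_{01} - \frac{2}{\kappa_{SOS}}\, e^{-\mu(1 - \lambda/2)} (k z_1 + k z_2 + z_3)
\sum_{k=0}^{l' - l_S - 1} \prod_{l''=1}^{l_S + k} \frac{\mu(1 - \lambda)/2}{l'' + \bar{\kappa}(t = \infty)/\kappa_{SOS}} \right]
\end{aligned}
\tag{13}
$$

where we define $\prod_{l''=1}^{0} = 1$.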

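If we define
$g_{l'}(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) = \sum_{k=l'}^{\infty} \prod_{l''=1}^{k} \frac{\mu(1 - \lambda)}{l'' + \bar{\kappa}(t = \infty)/\kappa_{SOS}}$, then
imposing the requirement that $\lim_{l' \to \infty} z^{(SOS)}_{0l'} = 0$ gives,
at steady-state, that,

$$
\kappa_{SOS}\, z^{(SOS)}_{01} = 2 e^{-\mu(1 - \lambda/2)} [k z_1 + k z_2 + z_3]\,
g_{l_S}(\mu/2, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS})
\tag{14}
$$

Using a similar argument, we obtain,

$$
\kappa_{SOS}\, z^{(SOS)}_{11} = e^{-\mu(1 - \lambda)} [k z_1 + 2k z_2 + 2 z_3 + z_4]\,
g_{l_S}(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS})
\tag{15}
$$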
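
The functions $f_l$ and $g_{l'}$ are simple to evaluate numerically. The Python sketch below is one possible implementation, assuming $g_{l'}(\mu, \lambda; \bar{\kappa}, \kappa_{SOS}) = \sum_{k \ge l'} \prod_{l''=1}^{k} \mu(1-\lambda)/(l'' + \bar{\kappa}/\kappa_{SOS})$; the truncation rule, the tolerance, and all names are our own choices. It also illustrates that, for very large $\kappa_{SOS}$, $g_{l_S}$ approaches $e^{\mu(1-\lambda)} - f_{l_S-1}(\mu, \lambda)$.

```python
import math

def f(l, mu, lam):
    """Truncated exponential series: sum_{k=0}^{l} [mu (1 - lam)]^k / k!."""
    x = mu * (1.0 - lam)
    return sum(x ** k / math.factorial(k) for k in range(l + 1))

def g(l, mu, lam, kappa_bar, kappa_sos, tol=1e-16, max_terms=10_000):
    """sum_{k=l}^{inf} prod_{j=1}^{k} x / (j + c), with x = mu (1 - lam) and
    c = kappa_bar / kappa_sos. The sum is truncated once the terms are
    decreasing and negligible relative to the running total (a heuristic)."""
    x = mu * (1.0 - lam)
    c = kappa_bar / kappa_sos
    term = 1.0                       # empty product, corresponds to k = 0
    total = 1.0 if l == 0 else 0.0
    for k in range(1, max_terms + 1):
        term *= x / (k + c)
        if k >= l:
            total += term
            if k > x and term <= tol * total:
                break
    return total

if __name__ == "__main__":
    mu, lam, l_s = 2.0, 0.25, 3
    print(g(l_s, mu, lam, kappa_bar=1.0, kappa_sos=1e12))
    print(math.exp(mu * (1 - lam)) - f(l_s - 1, mu, lam))
```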

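For the steady-state value of $z^{(SOS)}$, we have, using
the identity $\bar{\kappa}(t) = k z_1 + 2k z_2 + 2 z_3 + z_4$,

$$
z^{(SOS)} = 1 - e^{-\mu(1 - \lambda)} \left( f_{l_S-1}(\mu, \lambda) + g_{l_S}(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) \right)
\tag{16}
$$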



3. Computing $\bar{\kappa}(t = \infty)$



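Plugging our expressions for $\kappa_{SOS}\, z^{(SOS)}_{01}$ and
$\kappa_{SOS}\, z^{(SOS)}_{11}$ into the steady-state population fractions
equations, we obtain,

$$
\begin{aligned}
0 &= -(k + \bar{\kappa}(t = \infty))\, z_1
+ 2 e^{-\mu(1 - \lambda/2)} \left( 1 + g_{l_S}(\mu/2, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) \right) [k z_1 + k z_2 + z_3] \\
0 &= -(k + \bar{\kappa}(t = \infty))\, z_2
+ (f_l(\mu, \lambda) - 1)\, e^{-\mu(1 - \lambda/2)} [k z_1 + k z_2 + z_3] \\
0 &= -(1 + \bar{\kappa}(t = \infty))\, z_3
+ (f_{l_S-1}(\mu, \lambda) - f_l(\mu, \lambda))\, e^{-\mu(1 - \lambda/2)} [k z_1 + k z_2 + z_3] \\
0 &= -(1 + \bar{\kappa}(t = \infty))\, z_4
+ 2 e^{-\mu(1 - \lambda)} \left( f_{l_S-1}(\mu, \lambda) + g_{l_S}(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) \right)
[k z_1 + 2k z_2 + 2 z_3 + z_4] \\
&\quad - 2 e^{-\mu(1 - \lambda/2)} \left( f_{l_S-1}(\mu, \lambda) + g_{l_S}(\mu/2, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) \right)
[k z_1 + k z_2 + z_3]
\end{aligned}
\tag{17}
$$

From these equations we may derive the equality,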



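$$
k(z_1 + z_2) + z_3 = [k(z_1 + z_2) + z_3]\, e^{-\mu(1 - \lambda/2)}
\left[ k\, \frac{1 + 2 g_{l_S}(\mu/2, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) + f_l(\mu, \lambda)}{k + \bar{\kappa}(t = \infty)}
+ \frac{f_{l_S-1}(\mu, \lambda) - f_l(\mu, \lambda)}{1 + \bar{\kappa}(t = \infty)} \right]
\tag{18}
$$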



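Below the error catastrophe, when $z_1$, $z_2$, $z_3$ are not all
0, we may cancel $k(z_1 + z_2) + z_3$ from both sides of the
equation and re-arrange to obtain,

$$
\bar{\kappa}(t = \infty)^2 - A(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS})\, \bar{\kappa}(t = \infty)
- B(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) = 0
\tag{19}
$$

where,

$$
\begin{aligned}
A(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) &= k \left[ e^{-\mu(1 - \lambda/2)} \left( 1 + f_l(\mu, \lambda)
+ 2 g_{l_S}(\mu/2, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) \right) - 1 \right] \\
&\quad + e^{-\mu(1 - \lambda/2)} \left( f_{l_S-1}(\mu, \lambda) - f_l(\mu, \lambda) \right) - 1 \\
B(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) &= k \left[ e^{-\mu(1 - \lambda/2)} \left( 1 + f_{l_S-1}(\mu, \lambda)
+ 2 g_{l_S}(\mu/2, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) \right) - 1 \right]
\end{aligned}
\tag{20}
$$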

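Beyond the error catastrophe, the mutation rate is sufficiently
high that the selective advantage for remaining
localized about the $l_C = 0$ genomes disappears, so that
$z_1$, $z_2$, and $z_3$ drop to 0. The relevant steady-state equation
is then,

$$
0 = -(1 + \bar{\kappa}(t = \infty))\, z_4 + 2 e^{-\mu(1 - \lambda)}
\left( f_{l_S-1}(\mu, \lambda) + g_{l_S}(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) \right) z_4
\tag{21}
$$

which may be solved for $\bar{\kappa}(t = \infty)$ to give,

$$
\bar{\kappa}(t = \infty) = 2 e^{-\mu(1 - \lambda)}
\left[ f_{l_S-1}(\mu, \lambda) + g_{l_S}(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) \right] - 1
\tag{22}
$$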

The error catastrophe occurs at the mutation rate for 
which the two expressions for the mean equilibrium fit- 
ness become equal. 
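
Because $g_{l_S}$ itself depends on $\bar{\kappa}(t = \infty)$, both steady-state expressions are implicit and must in general be solved numerically. The Python sketch below shows one way this might be done, assuming the forms of $A$, $B$, $f_l$ and $g_{l'}$ given above: a fixed-point iteration on the positive root of the quadratic below the error catastrophe, and bisection on the interval $[0, 1]$ above it. The function names, solver choices, and tolerances are our own and are not taken from the original work.

```python
import math

def f(l, mu, lam):
    x = mu * (1.0 - lam)
    return sum(x ** k / math.factorial(k) for k in range(l + 1))

def g(l, mu, lam, kappa_bar, kappa_sos, tol=1e-16, max_terms=10_000):
    x, c = mu * (1.0 - lam), kappa_bar / kappa_sos
    term, total = 1.0, (1.0 if l == 0 else 0.0)
    for k in range(1, max_terms + 1):
        term *= x / (k + c)
        if k >= l:
            total += term
            if k > x and term <= tol * total:
                break
    return total

def mean_fitness_below(mu, lam, k, l, l_s, kappa_sos, iters=500):
    """Fixed-point iteration on the positive root of the quadratic for the mean fitness."""
    l = min(l, l_s - 1)                     # l is re-set to l_S - 1 when l >= l_S
    e = math.exp(-mu * (1.0 - lam / 2.0))
    kbar = float(k)
    for _ in range(iters):
        g_half = g(l_s, mu / 2.0, lam, kbar, kappa_sos)
        A = (k * (e * (1 + f(l, mu, lam) + 2 * g_half) - 1)
             + e * (f(l_s - 1, mu, lam) - f(l, mu, lam)) - 1)
        B = k * (e * (1 + f(l_s - 1, mu, lam) + 2 * g_half) - 1)
        new = max(0.0, 0.5 * (A + math.sqrt(max(A * A + 4 * B, 0.0))))
        if abs(new - kbar) < 1e-12:
            return new
        kbar = new
    return kbar

def mean_fitness_above(mu, lam, l_s, kappa_sos, tol=1e-12):
    """Bisection for kbar = 2 exp(-mu(1-lam)) [f_{l_S-1} + g_{l_S}(kbar)] - 1 on [0, 1]."""
    def rhs(kbar):
        return (2 * math.exp(-mu * (1 - lam))
                * (f(l_s - 1, mu, lam) + g(l_s, mu, lam, kbar, kappa_sos)) - 1)
    lo, hi = 0.0, 1.0                       # rhs(0) = 1 and rhs(1) <= 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rhs(mid) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    # Parameter values of the kind used in the figures; mu is varied
    for mu in (0.5, 1.0, 2.0, 4.0):
        print(mu, mean_fitness_below(mu, 0.08, 9.0, 4, 5, 100.0),
              mean_fitness_above(mu, 0.08, 5, 100.0))
```

Under these assumptions, the error catastrophe would be located by scanning $\mu$ for the point at which the two returned values coincide.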

4. Limiting Cases

Case 1: $\lambda = 1$

When $\lambda = 1$, we get for $l_S > 0$ that $g_{l_S}(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) = 0$,
and that $f_{l_S-1}(\mu, \lambda) = 1$. Therefore,
above the error catastrophe, we obtain $\bar{\kappa}(t = \infty) = 1$.
Below the error catastrophe, we have
$A(\mu, 1; \bar{\kappa}(t = \infty), \kappa_{SOS}) = k(2e^{-\mu/2} - 1) - 1$,
$B(\mu, 1; \bar{\kappa}(t = \infty), \kappa_{SOS}) = k(2e^{-\mu/2} - 1)$, giving
$\bar{\kappa}(t = \infty) = k(2e^{-\mu/2} - 1)$. These results are in agreement with the
solution of the semiconservative quasispecies equations
with perfect lesion repair.

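Case 2: $l_S = \infty$

When $l_S \to \infty$, then $g_{l_S}(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) \to 0$.
Below the error catastrophe, we have
$A(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) = k[e^{-\mu(1 - \lambda/2)}(1 + f_l(\mu, \lambda)) - 1] - f_l(\mu, \lambda) e^{-\mu(1 - \lambda/2)} + e^{-\mu\lambda/2} - 1$,
and $B(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) = k(e^{-\mu(1 - \lambda/2)} + e^{-\mu\lambda/2} - 1)$.
Above the error catastrophe, we have $\bar{\kappa}(t = \infty) = 1$. Both results
are in agreement with the semiconservative quasispecies
equations with arbitrary lesion repair efficiency [10].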

Case 3: $\kappa_{SOS} \to \infty$

When $\kappa_{SOS} \to \infty$, then $g_{l_S}(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) = e^{\mu(1 - \lambda)} - f_{l_S-1}(\mu, \lambda)$.
Above the error catastrophe, we get that $\bar{\kappa}(t = \infty) = 1$. Below the
error catastrophe, we obtain that
$A(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) = k[e^{-\mu(1 - \lambda/2)}(1 + f_l(\mu, \lambda) + 2e^{\mu(1 - \lambda)/2} - 2f_{l_S-1}(\mu/2, \lambda)) - 1] + e^{-\mu(1 - \lambda/2)}(f_{l_S-1}(\mu, \lambda) - f_l(\mu, \lambda)) - 1$,
and $B(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) = k[e^{-\mu(1 - \lambda/2)}(1 + f_{l_S-1}(\mu, \lambda) + 2e^{\mu(1 - \lambda)/2} - 2f_{l_S-1}(\mu/2, \lambda)) - 1]$.

Taking $l_S = 1$ for $\kappa_{SOS} \to \infty$ gives
$A(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) = k[2e^{-\mu/2} - 1] - 1$, and
$B(\mu, \lambda; \bar{\kappa}(t = \infty), \kappa_{SOS}) = k[2e^{-\mu/2} - 1]$, so that
$\bar{\kappa}(t = \infty) = k[2e^{-\mu/2} - 1]$ below the error catastrophe. This result
is identical with the semiconservative quasispecies equations
with perfect lesion repair, which makes sense, since
here we assume that any lesion is eliminated instantaneously [10].

5. Optimal Cutoff 

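If we assume that $k \gg 1$, and $\kappa_{SOS} \to \infty$, then it
is possible to find the value of $l_S$ which maximizes the
steady-state mean fitness $\bar{\kappa}(t = \infty)$. To do this, we define
a normalized mean fitness $\phi$ to be equal to $\bar{\kappa}(t = \infty)/k$,
and if we divide Eq. (19) by $k^2$, we obtain that $\phi$ is the
solution to,

$$
\phi^2 - \alpha(\mu, \lambda; \phi, \kappa_{SOS})\, \phi - \frac{1}{k}\, \beta(\mu, \lambda; \phi, \kappa_{SOS}) = 0
\tag{23}
$$

where
$\alpha(\mu, \lambda; \phi, \kappa_{SOS}) = e^{-\mu(1 - \lambda/2)}[1 + f_l(\mu, \lambda) + 2e^{\mu(1 - \lambda)/2} - 2f_{l_S-1}(\mu/2, \lambda)] - 1 + \frac{1}{k}[e^{-\mu(1 - \lambda/2)}(f_{l_S-1}(\mu, \lambda) - f_l(\mu, \lambda)) - 1]$, and
$\beta(\mu, \lambda; \phi, \kappa_{SOS}) = e^{-\mu(1 - \lambda/2)}[1 + f_{l_S-1}(\mu, \lambda) + 2e^{\mu(1 - \lambda)/2} - 2f_{l_S-1}(\mu/2, \lambda)] - 1$.

Therefore, for large $k$ we obtain that
$\phi \to \lim_{k \to \infty} \alpha(\mu, \lambda; \phi, \kappa_{SOS})$, which gives,

$$
\phi = e^{-\mu(1 - \lambda/2)} + 2e^{-\mu/2} - 1
+ e^{-\mu(1 - \lambda/2)} \left( f_l(\mu, \lambda) - 2 f_{l_S-1}(\mu/2, \lambda) \right)
\tag{24}
$$

so that maximizing $\phi$ is equivalent to maximizing
$f_l(\mu, \lambda) - 2 f_{l_S-1}(\mu/2, \lambda)$.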

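Now, because $l$ must be re-set to $l_S - 1$ whenever
we take $l_S \le l$, we can only vary $l_S$ independently
of $l$ whenever $l_S > l$. In this regime, the expression
$f_l(\mu, \lambda) - 2 f_{l_S-1}(\mu/2, \lambda)$ is maximized whenever
$l_S = l + 1$.

In the regime where $l_S \le l$, $l$ is re-set to $l_S - 1$, and so,

$$
\begin{aligned}
f_l(\mu, \lambda) - 2 f_{l_S-1}(\mu/2, \lambda) &= f_{l_S-1}(\mu, \lambda) - 2 f_{l_S-1}(\mu/2, \lambda) \\
&= -1 + \mu(1 - \lambda) \sum_{k=1}^{l_S - 2} \frac{[\mu(1 - \lambda)]^k}{(k + 1)!} \left( 1 - \frac{1}{2^k} \right)
\end{aligned}
\tag{25}
$$

and so this expression is equal to $-1$ for $l_S = 1, 2$, and
then increases with successive values of $l_S$.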

Now, because $l$ is re-set to $l_S - 1$ for $l_S \le l$, it follows
that we take $l = l_S - 1$ for $l_S \le l + 1$. For $l = 0$, we then
obtain that $\phi$ is maximized over $l_S \le l + 1$ for $l_S = 1$,
while when $l = 1$, we obtain that $\phi$ is maximized over
$l_S \le l + 1$ for $l_S = 1, 2$. For $l \ge 2$, we obtain that $\phi$ is
maximized over $l_S \le l + 1$ for $l_S = l + 1$.

Therefore, in any case, we can maximize $\phi$ over $l_S \le l + 1$
by taking $l_S = l + 1$. Since we can maximize $\phi$
over $l_S \ge l + 1$ by setting $l_S = l + 1$, it follows that $\phi$ is
maximized when $l_S = l + 1$.
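
The optimal cutoff can also be found by brute force. The following Python sketch evaluates the large-$k$, $\kappa_{SOS} \to \infty$ normalized fitness $\phi$ for a range of $l_S$, assuming the limiting form $\phi = e^{-\mu(1-\lambda/2)} + 2e^{-\mu/2} - 1 + e^{-\mu(1-\lambda/2)}(f_l - 2f_{l_S-1}(\mu/2, \lambda))$ and re-setting $l$ to $l_S - 1$ whenever $l_S \le l$. The function names and the search range `max_l_s` are illustrative choices.

```python
import math

def f(l, mu, lam):
    x = mu * (1.0 - lam)
    return sum(x ** k / math.factorial(k) for k in range(l + 1))

def phi(l_s, l, mu, lam):
    """Normalized mean fitness in the limit k >> 1 and kappa_SOS -> infinity."""
    l_eff = min(l, l_s - 1)          # l is re-set to l_S - 1 when l_S <= l
    e = math.exp(-mu * (1.0 - lam / 2.0))
    return (e + 2.0 * math.exp(-mu / 2.0) - 1.0
            + e * (f(l_eff, mu, lam) - 2.0 * f(l_s - 1, mu / 2.0, lam)))

def best_cutoff(l, mu, lam, max_l_s=20):
    """Value of l_S in 1..max_l_s that maximizes phi (first one if there are ties)."""
    return max(range(1, max_l_s + 1), key=lambda s: phi(s, l, mu, lam))

if __name__ == "__main__":
    for l in (2, 4, 6):
        print(l, best_cutoff(l, mu=2.0, lam=0.1))
```

For $l \ge 2$ the maximizer is unique, and the search returns $l_S = l + 1$, in line with the argument above.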

We reach the conclusion that, when the fitness penalty 
for having a non-viable genome is sufficiently great, the 
SOS response will confer a maximum selective advantage 
if it is activated when and only when the genome has sustained
sufficient genetic damage so that it will be unviable
without SOS repair.



B. Stochastic simulations 

We developed stochastic simulations of a unicellular 
population capable of undergoing the SOS response, in 
order to numerically test the analytical predictions of our 
model. We consider a constant population of genomes 
that is cycled over every time step. During each cycle,
every genome is allowed to replicate with a probability
$\kappa_{\{\sigma,\sigma'\}} \Delta t$, where $\kappa_{\{\sigma,\sigma'\}}$ is the first-order growth
rate constant of genome $\{\sigma, \sigma'\}$, and $\Delta t$ is the length of
the time step. We take $\Delta t$ to be sufficiently small so that
the probability of a given genome replicating more than
once during a cycle is negligible.

We assume that the population initially consists of a 
clonal population of wild-type (mutation-free) genomes. 
The fitness of a given genome $\{\sigma, \sigma'\}$ is determined by
assigning $l_C$, $l_L$, $l_R$, $l_B$ parameters to the ordered pairs
$(\sigma, \sigma')$, $(\sigma', \sigma)$ with respect to the ordered pair $(\sigma_0, \bar{\sigma}_0)$.
The fitness is then taken to be the larger of the two fitnesses
associated with the two sets of parameters.

If a genome replicates during a cycle, then it is re- 
moved from the population, and the two daughters are 
added to the population of genomes. To maintain a con- 
stant population size, another, randomly chosen genome 
is removed from the population as well. 

If a daughter genome is produced that has at least Is 
lesions, then it enters the SOS response, and is assigned 
a replication probability of 0. A genome that has initi- 
ated the SOS response continues to undergo SOS repair 
until all lesions have been removed, and a complementary
genome has been restored. During every time step,
a genome that is undergoing the SOS response has its lesions
scanned, and each lesion is repaired with probability
$\kappa_{SOS} \Delta t$. In addition to being chosen small enough so
that the probability of a given genome replicating more
than once during a cycle is negligible, we also choose
$\Delta t$ to be sufficiently small so that the probability that
a given genome undergoing the SOS response has more
than one lesion repaired during a cycle is also negligible.

The stochastic simulation is allowed to run for a sufficient
number of time steps so that the mean fitness of the
population does not change significantly, at which point
the system is assumed to be at steady-state.

FIG. 2: Comparison of the mean fitnesses obtained from both
stochastic simulations (dots) and the analytical solution (solid
line) of our model. Parameter values are $k = 9$, $l = 4$, $l_S = 5$,
$\lambda = 0.08$, $\kappa_{SOS} = 100$, $L = 100$. The population size was set
at 1000.

FIG. 3: Comparison of the mean fitnesses obtained from both
stochastic simulations (dots) and the analytical solution (solid
line) of our model. Parameter values are $k = 9$, $l = 4$, $l_S = 5$,
$\lambda = 0.08$, $\kappa_{SOS} = 10$, $L = 100$. The population size was set
at 1000.

Figures 2 and 3 show plots comparing the mean fitness 
obtained from the analytical solution to the mean fitness 
obtained from the stochastic simulations. As can be seen 
from the figures, the agreement between the analytical 
solution and the stochastic simulation is excellent. 
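
The simulation procedure described above can be sketched in a few lines of Python. This is not the code used for the figures: it is a reduced-state illustration that assumes the infinite-sequence-length (Poisson) limit instead of explicit sequences with $L = 100$, and it tracks each genome only by its number of fixed mutations and its number of lesions, with all lesions taken to lie on the newer strand. All function and parameter names are our own.

```python
import math
import random

def poisson(rng, mean):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, k, p = math.exp(-mean), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def binomial(rng, n, p):
    """Number of successes in n independent trials of probability p."""
    return sum(rng.random() < p for _ in range(n))

def simulate(mu, lam, k, l, l_s, kappa_sos, pop=200, dt=1e-3, steps=2000, seed=0):
    rng = random.Random(seed)
    # Each genome is [fixed mutations, lesions, undergoing SOS?]; start clonal
    genomes = [[0, 0, False] for _ in range(pop)]

    def daughter(template_fixed):
        errors = poisson(rng, mu)                 # mismatches from strand synthesis
        repaired = binomial(rng, errors, lam)     # removed by lesion repair
        fixed_now = binomial(rng, repaired, 0.5)  # half of the repairs fix a mutation
        lesions = errors - repaired
        return [template_fixed + fixed_now, lesions, lesions >= l_s]

    def kappa(genome):
        if genome[2]:
            return 0.0                            # no replication during SOS repair
        return k if (genome[0] == 0 and genome[1] <= l) else 1.0

    for _ in range(steps):
        for i in range(pop):
            genome = genomes[i]
            if genome[2]:
                repaired = binomial(rng, genome[1], kappa_sos * dt)
                genome[0] += binomial(rng, repaired, 0.5)
                genome[1] -= repaired
                genome[2] = genome[1] > 0         # SOS ends when no lesions remain
            elif rng.random() < kappa(genome) * dt:
                j = rng.randrange(pop - 1)        # another genome, removed to keep
                j += j >= i                       # the population size constant
                genomes[i] = daughter(genome[0])               # older strand as template
                genomes[j] = daughter(genome[0] + genome[1])   # newer strand as template
    return sum(kappa(genome) for genome in genomes) / pop

if __name__ == "__main__":
    print(simulate(mu=1.0, lam=0.08, k=9.0, l=4, l_s=5, kappa_sos=100.0))
```

The returned value is the population mean fitness at the final time step, with genomes undergoing the SOS response contributing zero. In practice one would average over many late time steps and check that `dt` is small compared with both $1/k$ and $1/\kappa_{SOS}$.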



C. Conclusions and Future Research 

This paper developed a quasispecies approach for de- 
scribing the evolutionary dynamics of a unicellular pop- 
ulation that incorporated a simplified model of the SOS 
response. The model was a generalization of the single- 
fitness-peak landscape that is often used in quasispecies 
theory to study various problems in evolutionary dynamics.
The model was shown to be analytically solvable,
and it was found that the solution led to a maximal selective
lective advantage to the SOS response in a manner that is 
broadly consistent with the behavior of actual organisms. 

For future research, it will be important to move beyond
a phenomenological description of the evolutionary
dynamics associated with the SOS response, and to consider
more realistic models that allow for quantitative
predictions that can be compared with experiment.
Nevertheless, as discussed previously, we believe
that even this initial model could potentially be used to
understand qualitative aspects of the SOS response. Furthermore,
we believe that our model might also be useful
for obtaining order-of-magnitude estimates for various
parameters associated with the evolutionary dynamics of
the SOS response.